39 research outputs found
Metrical Grids and Generalized Tier Projection
This paper formalizes metrical grid theory (MGT, Prince, 1983; Hayes, 1995) and studies its expressive power. I show that MGT analyses of a certain form can describe stress systems beyond the input tier-based input strictly local functions proposed by Hao and Andersson (2019), but conjecture that such analyses do not describe systems beyond the input tier-based strictly local languages of Baek (2018). These results reveal fundamental differences between the three formalisms.
Learnability and Overgeneration in Computational Syntax
This paper addresses the hypothesis that unnatural patterns generated by grammar formalisms can be eliminated on the grounds that they are unlearnable. I consider three examples of formal languages thought to represent dependencies unattested in natural language syntax, and show that all three can be learned by grammar induction algorithms following the Distributional Learning paradigm of Clark and Eyraud (2007). While learnable language classes are restrictive by necessity (Gold, 1967), these facts suggest that learnability alone may be insufficient for addressing concerns of overgeneration in syntax.
Rhythmic Syncope in Subregular Phonology
Rhythmic syncope describes the deletion of vowels in an alternating rhythmic pattern, so that every other underlying vowel deletes. We informally summarize a proof that rhythmic syncope cannot be represented by a strictly local function over segments. Rather, rhythmic syncope can only be generated by a strictly local function if input and output symbols are synchronized, so that locality can be computed over both the input and output value at a particular time step. This structural property may only be needed to describe rhythmic syncope, which means that before concluding that human phonology can compute such functions, it is essential to verify the extent to which rhythmic syncope is attested as a stable and productive synchronic pattern.
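As a rough illustration of the pattern described above, the following Python sketch deletes every other vowel in a string. It is a toy under stated assumptions, not the formal construction from the paper: the vowel inventory, the function name `rhythmic_syncope`, and the convention that the first vowel survives and the second deletes are all hypothetical choices made for the example. The flag `delete_next` records what was done to the previous vowel, which is the kind of information about past actions that an input-only local window does not provide.

```python
# Toy vowel inventory (an assumption for this example).
VOWELS = set("aeiou")


def rhythmic_syncope(word: str) -> str:
    """Keep the 1st, 3rd, ... vowels and delete the 2nd, 4th, ... vowels.

    Which vowels delete is a convention chosen for illustration only.
    """
    out = []
    delete_next = False  # action to take on the next vowel encountered
    for seg in word:
        if seg in VOWELS:
            if not delete_next:
                out.append(seg)  # this vowel surfaces
            # The next vowel gets the opposite treatment of this one.
            delete_next = not delete_next
        else:
            out.append(seg)  # consonants always surface unchanged
    return "".join(out)


print(rhythmic_syncope("patakaso"))  # patkas
```

Because the fate of each vowel depends on whether the preceding vowel was kept or deleted, and all input vowels look alike, no fixed window over the input segments alone settles the outcome.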
Action-Sensitive Phonological Dependencies
This paper defines a subregular class of functions called the tier-based
synchronized strictly local (TSSL) functions. These functions are similar to
the tier-based input-output strictly local (TIOSL) functions, except that
the locality condition is enforced not on the input and output streams, but on
the computation history of the minimal subsequential finite-state transducer.
We show that TSSL functions naturally describe rhythmic syncope while TIOSL
functions cannot, and we argue that TSSL functions provide a more restricted
characterization of rhythmic syncope than existing treatments within Optimality
Theory.
Comment: To appear in the Proceedings of the 16th SIGMORPHON Workshop on
Computational Research in Phonetics, Phonology, and Morphology
Computing Vowel Harmony: The Generative Capacity of Search & Copy
Search & Copy (S&C) is a procedural model of vowel harmony in which underspecified vowels trigger searches for targets that provide them with features. In this paper, we seek to relate the S&C formalism to models of phonological locality proposed by recent work in the subregular program. Our goal is to provide a formal description, within the framework of mathematical linguistics, of the range of possible phonological transformations that admit an analysis within S&C. We show that, when S&C is used in its unidirectional mode, all transformations described by an S&C analysis can be modeled by tier-based input strictly local (TISL) functions. This result improves on the previous result of Gainor et al. (2012), which showed that vowel harmony processes can be modeled by subsequential functions. However, non-TISL transformations can be given S&C descriptions in two ways. First, since TISL functions are not closed under composition, a non-TISL vowel harmony pattern may be obtained by applying two S&C rules sequentially. Second, when S&C is used in its bidirectional mode, it can describe transformations that cannot be modeled by finite-state functions.
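The unidirectional case can be pictured with a small Python sketch. This is a simplified toy, not the S&C formalism itself: the placeholder symbol `A` for an underspecified vowel, the `BACK` feature table, the `REALIZE` mapping, and the `default_back` fallback are all hypothetical. Each `A` searches leftward for the nearest vowel that is specified for [back] in the input and copies that value, so the output depends only on the input vowel tier.

```python
# Hypothetical feature table: True means [+back], False means [-back].
BACK = {"a": True, "o": True, "u": True, "e": False, "i": False}
# Hypothetical surface forms of the underspecified vowel "A".
REALIZE = {True: "a", False: "e"}


def search_and_copy(word: str, default_back: bool = True) -> str:
    """Resolve each underspecified 'A' by copying [back] from the nearest
    specified vowel to its left in the input (leftward search only)."""
    out = []
    last_back = None  # [back] value of the nearest specified input vowel so far
    for seg in word:
        if seg == "A":
            # Search result: nearest specified vowel, or the default if none.
            value = default_back if last_back is None else last_back
            out.append(REALIZE[value])
        else:
            out.append(seg)
            if seg in BACK:
                last_back = BACK[seg]  # only specified input vowels are targets
    return "".join(out)


print(search_and_copy("evlArdA"))  # evlerde
print(search_and_copy("kollAr"))   # kollar
```

Consonants are skipped during the search, which is what makes the dependency local on a vowel tier even when arbitrarily many consonants intervene.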
MILL: Mutual Verification with Large Language Models for Zero-Shot Query Expansion
Query expansion is a commonly used technique in many search systems to better
represent users' information needs with additional query terms. Existing
studies for this task usually propose to expand a query with retrieved or
generated contextual documents. However, both types of methods have clear
limitations. For retrieval-based methods, the documents retrieved with the
original query might not be accurate enough to reveal the search intent,
especially when the query is brief or ambiguous. For generation-based methods,
existing models can hardly be trained or aligned on a particular corpus, due to
the lack of corpus-specific labeled data. In this paper, we propose a novel
Large Language Model (LLM) based mutual verification framework for query
expansion, which alleviates the aforementioned limitations. Specifically, we
first design a query-query-document generation pipeline, which can effectively
leverage the contextual knowledge encoded in LLMs to generate sub-queries and
corresponding documents from multiple perspectives. Next, we employ a mutual
verification method for both generated and retrieved contextual documents,
where 1) retrieved documents are filtered with the external contextual
knowledge in generated documents, and 2) generated documents are filtered with
the corpus-specific knowledge in retrieved documents. Overall, the proposed
method allows retrieved and generated documents to complement each other to
finalize a better query expansion. We conduct extensive experiments on three
information retrieval datasets, i.e., TREC-DL-2020, TREC-COVID, and MSMARCO.
The results demonstrate that our method significantly outperforms the
baselines.
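The two-way filtering idea can be sketched in Python as follows. The abstract does not say how documents are scored or selected, so everything concrete here is an assumption made for illustration: the Jaccard token-overlap similarity, the top-`k` cutoff, and the function names `jaccard`, `filter_by`, and `mutual_verification` are hypothetical and are not taken from the paper.

```python
def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity (a stand-in for whatever scorer is used)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def filter_by(candidates, references, k):
    """Keep the k candidates with the highest mean similarity to references."""
    def score(doc):
        if not references:
            return 0.0
        return sum(jaccard(doc, ref) for ref in references) / len(references)
    return sorted(candidates, key=score, reverse=True)[:k]


def mutual_verification(retrieved, generated, k=1):
    """Filter each document set using the other set as evidence."""
    kept_retrieved = filter_by(retrieved, generated, k)  # checked by generated
    kept_generated = filter_by(generated, retrieved, k)  # checked by retrieved
    return kept_retrieved, kept_generated


retrieved = ["covid vaccine side effects", "football match results"]
generated = ["side effects of the covid vaccine", "recipes for pasta"]
print(mutual_verification(retrieved, generated, k=1))
```

In this toy run, the off-topic retrieved document and the off-topic generated document are each dropped because they share no tokens with the other set, and the surviving documents would then be used to expand the query.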